Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cram view gs cloud streaming. #92

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

DailyDreaming
Copy link
Member

No description provided.

@DailyDreaming DailyDreaming requested a review from xbrianh January 2, 2021 08:35
@DailyDreaming DailyDreaming self-assigned this Jan 2, 2021
@xbrianh
Copy link
Member

xbrianh commented Jan 8, 2021

Hi @DailyDreaming, this is an interesting approach! However, before diving in can you comment on the following?:

  • xsamtools already has built in support for FIFO pipes over gs blobs. It would be preferable to use that here, if possible, unless there’s a reason to prefer this method.
  • The main reason to use cloud slicing is to download only the interesting parts of the CRAM file. However, using a stream seems to force the download of the entire file. Or at least all data from the beginning of the file to the portion of interest, which may be a lot. What is the benefit of streaming for this use case?

@DailyDreaming
Copy link
Member Author

@xbrianh I'll replace this with the xsamtools method. This will still be used to cover the case where the entire file needs to be streamed in (no or all region args).

This ticket doesn't implement slicing, so by necessity, it's an implementation that forces the download of the whole file. Which is still a valid use case and slicing will only be implemented when a subset of regions is given, otherwise all regions means the entire file. This implementation should be strictly more efficient than the current implementation, since the current implementation downloads/writes the file, then hands it to samtools, which writes another file. Streaming saves the IO for an entire step.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants